National Data Management Center¶
- Data Analytics, Modeling and Visualization
Importing the relevant libraries¶
Below are all the required packages for the analysis and modeling.
A. Loading the data¶
| | packet_version_id | id_ver_nmb | champs_id | dp_001 | dp_002 | dp_003 | dp_004 | dp_005 | dp_006 | dp_007 | ... | dpf_012___ch00040 | dpf_012___ch00041 | dpf_012___ch00042 | dpf_012___ch00043 | dpf_012___ch01424 | dpf_012___ch01875 | dpf_012___ch00010 | dpf_013 | dpf_014 | crf_060302_decode_panel_feedback_form_complete |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ETAA00002_01_01 | 2.0.0 | ETAA00002 | 5 | 1 | 2 | 3 | 4.0 | 5.0 | 6.0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Tseyon Tesfaye Clinical | None | 2 |
| 1 | ETAA00004_01_02 | 2.0.0 | ETAA00004 | 5 | 1 | 2 | 3 | 4.0 | 5.0 | 6.0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Adugna (SBS team), Tigistu (counselor), Tseyon... | NaN | 2 |
2 rows × 381 columns
B. Shape of the dataset¶
(444, 381)
| | champs_id | dp_013 | dp_108 | dp_118 |
|---|---|---|---|---|
| 0 | ETAA00002 | CH00716 | Undetermined | Undetermined |
| 1 | ETAA00004 | CH00716 | Undetermined | Undetermined |
| 2 | ETAA00005 | CH00716 | Intrauterine hypoxia | Fetus and newborn affected by other forms of p... |
| 3 | ETAA00008 | CH00719 | Severe acute malnutrition - Kwashiorkor | NaN |
| 4 | ETAA00009 | CH01406 | Sepsis | NaN |
(444, 4)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 444 entries, 0 to 443
Data columns (total 4 columns):

| # | Column | Non-Null Count | Dtype |
|---|---|---|---|
| 0 | champs_id | 444 non-null | object |
| 1 | dp_013 | 444 non-null | object |
| 2 | dp_108 | 444 non-null | object |
| 3 | dp_118 | 197 non-null | object |

dtypes: object(4)
memory usage: 14.0+ KB
| | champs_id | dp_013 | dp_108 | dp_118 |
|---|---|---|---|---|
| count | 444 | 444 | 444 | 197 |
| unique | 444 | 6 | 97 | 97 |
| top | ETAA00002 | CH00716 | Intrauterine hypoxia | Preeclampsia |
| freq | 1 | 239 | 148 | 36 |
| Column | Missing values |
|---|---|
| champs_id | 0 |
| dp_013 | 0 |
| dp_108 | 0 |
| dp_118 | 247 |

dtype: int64
C. Enumerate the columns of the dataset¶
- Column 0: champs_id
- Column 1: dp_013
- Column 2: dp_108
- Column 3: dp_118
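The inspection outputs in sections B and C can be reproduced with a handful of standard pandas calls. The sketch below is a minimal illustration, not the notebook's original code: the DataFrame name `df` is an assumption, and the three rows are copied from the preview above as a stand-in for the full 444-row extract.

```python
import pandas as pd

# Hypothetical stand-in for the 444-row extract: three rows copied from the preview above.
df = pd.DataFrame(
    {
        "champs_id": ["ETAA00002", "ETAA00008", "ETAA00009"],
        "dp_013": ["CH00716", "CH00719", "CH01406"],
        "dp_108": ["Undetermined", "Severe acute malnutrition - Kwashiorkor", "Sepsis"],
        "dp_118": ["Undetermined", None, None],
    }
)

print(df.shape)        # (number of rows, number of columns)
df.info()              # dtype and non-null count per column
print(df.describe())   # count / unique / top / freq for text columns

missing = df.isnull().sum()
print(missing)         # missing values per column

# Enumerate the columns together with their positional index.
column_labels = [f"Column {i}: {name}" for i, name in enumerate(df.columns)]
print("\n".join(column_labels))
```

On the full extract the same calls would report the `(444, 4)` shape and the 247 missing `dp_118` values shown above.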
D. Rename columns¶
- Columns are renamed here, as directed, for better understanding.
Updating values¶
- In this section some of the coded values are replaced with readable labels, in particular the values of `Case Type`.
| | CHAMPS_ID | Case Type | Underlying Cause | Maternal Factor |
|---|---|---|---|---|
| 399 | ETAA01154 | Stillbirth | Intrauterine hypoxia | NaN |
| 334 | ETAA01007 | Stillbirth | Intrauterine hypoxia | Fetus and newborn affected by other malpresent... |
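The renaming and value update can be sketched as follows. The column mapping is read off the before and after tables in this notebook. The only code-to-label pair that can be inferred from the outputs is `CH00716` to `Stillbirth` (the code occurs 239 times, matching the Stillbirth count); every other code would have to come from the data dictionary, so the sketch leaves them untouched. The names `column_names` and `case_type_labels` are hypothetical.

```python
import pandas as pd

# Two rows copied from the preview above, standing in for the full extract.
df = pd.DataFrame(
    {
        "champs_id": ["ETAA00002", "ETAA00008"],
        "dp_013": ["CH00716", "CH00719"],
        "dp_108": ["Undetermined", "Severe acute malnutrition - Kwashiorkor"],
        "dp_118": ["Undetermined", None],
    }
)

# Raw column names mapped to the readable names used in the rest of the notebook.
column_names = {
    "champs_id": "CHAMPS_ID",
    "dp_013": "Case Type",
    "dp_108": "Underlying Cause",
    "dp_118": "Maternal Factor",
}
df = df.rename(columns=column_names)

# Assumption: only this pair is inferred from the notebook outputs;
# the remaining codes must be filled in from the data dictionary.
case_type_labels = {"CH00716": "Stillbirth"}

# Codes without a known label are kept unchanged rather than turned into NaN.
df["Case Type"] = df["Case Type"].map(lambda code: case_type_labels.get(code, code))
print(df)
```

Falling back to the original code for unmapped values keeps an incomplete dictionary from silently creating missing data.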
Null proportion in each column¶
- The proportion of null values in each column is computed.
- Around 55.6% of the values in the `Maternal Factor` column are null.
| Column | Null proportion |
|---|---|
| CHAMPS_ID | 0.000000 |
| Case Type | 0.000000 |
| Underlying Cause | 0.000000 |
| Maternal Factor | 0.556306 |

dtype: float64
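The 0.556306 figure is simply 247 missing values out of 444 rows. A minimal sketch of that calculation, assuming the proportion was obtained as the mean of the null mask (the original code cell is not shown), with a synthetic column built from the counts reported above:

```python
import pandas as pd

# Synthetic column with the same counts as reported above:
# 197 recorded maternal factors and 247 missing ones (444 rows in total).
maternal_factor = pd.Series(["Preeclampsia"] * 197 + [None] * 247, name="Maternal Factor")

# isnull() gives a True/False mask; its mean is the share of True values.
null_share = maternal_factor.isnull().mean()
print(round(null_share, 6))
```

Applied to a whole DataFrame, `df.isnull().mean()` returns one such proportion per column.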
2. Descriptive Data analysis¶
- Based on the given decoded table and the dictionary, a descriptive analysis is performed on the dataset.
A. What are the magnitude and proportion of each underlying cause of child death?¶
Identify the driving factors of child death:¶
Here the `Underlying Cause` column is used to find the magnitude and proportion of each underlying cause.
| | Magnitude | Proportion (%) |
|---|---|---|
| Intrauterine hypoxia | 148 | 33.333333 |
| Birth asphyxia | 33 | 7.432432 |
| Undetermined | 28 | 6.306306 |
| Severe acute malnutrition | 24 | 5.405405 |
| Craniorachischisis | 16 | 3.603604 |
| ... | ... | ... |
| Severe acute malnutrition-Kwashiorkor | 1 | 0.225225 |
| severe acute malnutrition, Marasmic Kwashiorkor | 1 | 0.225225 |
| Severe acute malnutrition - Marasmic kwashiorkor | 1 | 0.225225 |
| Congenital CMV infection | 1 | 0.225225 |
| Bacterial sepsis of the newborn | 1 | 0.225225 |

[97 rows x 2 columns]
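A magnitude and proportion table of this kind can be produced with `value_counts`. The sketch below is one possible way to do it, not necessarily the code behind the output above. The helper name `magnitude_and_proportion` is hypothetical, and the input is synthetic: the three largest counts are taken from the table, while the remaining 235 rows are lumped into a placeholder bucket.

```python
import pandas as pd


def magnitude_and_proportion(column: pd.Series) -> pd.DataFrame:
    """Count each category and express it as a percentage of the non-null rows."""
    counts = column.value_counts()  # sorted by count, NaN excluded by default
    return pd.DataFrame(
        {"Magnitude": counts, "Proportion (%)": counts / counts.sum() * 100}
    )


# Synthetic column: real counts for the top three causes, plus a
# hypothetical bucket standing in for all the remaining causes.
causes = pd.Series(
    ["Intrauterine hypoxia"] * 148
    + ["Birth asphyxia"] * 33
    + ["Undetermined"] * 28
    + ["(all other causes)"] * 235,
    name="Underlying Cause",
)

summary = magnitude_and_proportion(causes)
print(summary)
```

For example, 148 of 444 rows gives the 33.33% reported for `Intrauterine hypoxia`.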
*****Insight from the above | underlying cause*****
- The descriptive summary and bar graph above are condensed into the following points.
- `Intrauterine hypoxia` is by far the most frequent underlying cause, accounting for 33% of all deaths.
- `Birth asphyxia` is the second most frequent underlying cause, accounting for 7% of all deaths.
- For 6% of the deaths the underlying cause is `Undetermined`, which ranks third in this dataset.
- Next is `Severe acute malnutrition`, which accounts for 5% of the deaths.
- In summary, given its magnitude and proportion, special attention should go to reducing deaths caused by `Intrauterine hypoxia`.
B. What are the magnitude and proportion of the maternal factors contributing to child death?¶
Identify the driving factors of child death:¶
Here the `Maternal Factor` column is used to find the magnitude and proportion of each maternal factor.
| | Magnitude | Proportion (%) |
|---|---|---|
| Preeclampsia | 36 | 18.274112 |
| Twin pregnancy | 12 | 6.091371 |
| Fetus and newborn affected by other forms of pl... | 11 | 5.583756 |
| Eclampsia | 9 | 4.568528 |
| Fetus and newborn affected by other forms of pl... | 5 | 2.538071 |
| ... | ... | ... |
| Fetus and newborn affected by oligohydramnios | 1 | 0.507614 |
| Fetus and newborn affected by maternal diabetes | 1 | 0.507614 |
| Fetus and newborn affected by maternal infectio... | 1 | 0.507614 |
| Fetus and newborn affected by multiple pregnanc... | 1 | 0.507614 |
| Pre-labor rapture of membrane | 1 | 0.507614 |

[97 rows x 2 columns]
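The percentages in this table use the 197 rows with a recorded maternal factor as the denominator (36 / 197 = 18.27%), not all 444 deaths. The sketch below shows the difference between the two denominators on a synthetic column with the same counts; the placeholder bucket for the remaining factors is hypothetical.

```python
import pandas as pd

# Synthetic column: real counts for the two most frequent factors, a
# hypothetical bucket for the other recorded factors, and 247 missing values.
factors = pd.Series(
    ["Preeclampsia"] * 36
    + ["Twin pregnancy"] * 12
    + ["(all other factors)"] * 149
    + [None] * 247,
    name="Maternal Factor",
)

# Share among rows with a recorded factor (NaN is dropped by value_counts).
share_of_recorded = factors.value_counts(normalize=True) * 100

# Share among all rows, including those with no recorded factor.
share_of_all = factors.value_counts() / len(factors) * 100

print(round(share_of_recorded["Preeclampsia"], 2))  # relative to 197 rows
print(round(share_of_all["Preeclampsia"], 2))       # relative to 444 rows
```

Which denominator is appropriate depends on whether a missing value means "no maternal factor" or "not recorded"; the notebook does not say, so both are shown.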
*****Insight from the above | Maternal Factor*****
- The descriptive summary and bar graph above show the contribution of each maternal factor. The proportions are relative to the 197 cases with a recorded maternal factor, not to all 444 deaths.
- `Preeclampsia` is by far the most frequent maternal factor, recorded in 18% of those cases.
- `Twin pregnancy` is the second most frequent maternal factor, at 6%.
- `Fetus and newborn affected by other forms of pl...` (label truncated in the output) ranks third in this dataset, at about 5.6%.
- Next is `Eclampsia`, at about 4.6%.
- In summary, given its magnitude and proportion, special attention should go to reducing deaths to which `Preeclampsia` contributes.
C. What is the proportion of child deaths by case type?¶
| | Magnitude | Proportion (%) |
|---|---|---|
| Stillbirth | 239 | 53.828829 |
| Death in the first 24 hours | 69 | 15.540541 |
| Early Neonate (1 to 6 days) | 49 | 11.036036 |
| Child (12 months to less than 60 months) | 42 | 9.459459 |
| Infant (28 days to less than 12 months) | 27 | 6.081081 |
| Late Neonate (7 to 27 days) | 18 | 4.054054 |
*****Insight from the above | Case Type*****
- The descriptive summary and pie chart above show how the deaths are distributed across case types.
- `Stillbirth` is the most frequent case type, accounting for 53.8% of all cases.
- `Death in the first 24 hours` is the second most frequent case type, accounting for 15.5% of all cases.
- `Early Neonate (1 to 6 days)` ranks third in this dataset, at 11.0%.
- Next is `Child (12 months to less than 60 months)`, at 9.5%.
- In summary, given its magnitude and proportion, dedicated research is required to address the problems behind the `Stillbirth` case type.
3. Correlation analysis¶
Using correlation or Heat Maps, show how each of the infant underlying conditions and maternal factors are correlated to the top three causes of the child death identified above under 2(A)
Prepare Data for Correlation Analysis:¶
- First, create dummy variables for the categorical columns.
- Choose an appropriate encoding technique, such as `one-hot` encoding.
- Compute the correlation matrix.
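The three steps above can be sketched with `pandas.get_dummies` followed by `DataFrame.corr`. This is an assumption about how the matrix was produced: the column names in the output below (`Underlying Cause_Birth asphyxia` and so on) follow the default `get_dummies` naming, but the original code cell is not shown. The six rows are made up for illustration.

```python
import pandas as pd

# Made-up rows for illustration; category labels are taken from the notebook.
df = pd.DataFrame(
    {
        "Underlying Cause": [
            "Intrauterine hypoxia", "Intrauterine hypoxia", "Birth asphyxia",
            "Undetermined", "Birth asphyxia", "Intrauterine hypoxia",
        ],
        "Maternal Factor": [
            "Preeclampsia", None, "Abruptio placenta",
            "Undetermined", "Prolonged pregnancy", "Preeclampsia",
        ],
    }
)

# One-hot encode both categorical columns; a missing maternal factor
# becomes a row of zeros across the "Maternal Factor_*" columns.
dummies = pd.get_dummies(df, columns=["Underlying Cause", "Maternal Factor"], dtype=float)

# Pearson correlation between every pair of indicator columns.
corr = dummies.corr()

# Keep only the columns for the top three underlying causes.
top_cause_columns = [
    "Underlying Cause_Birth asphyxia",
    "Underlying Cause_Intrauterine hypoxia",
    "Underlying Cause_Undetermined",
]
corr_with_top = corr[top_cause_columns]
print(corr_with_top)
```

The full matrix can then be passed to a heat map function; slicing the three cause columns first keeps the plot readable.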
Underlying Cause_Birth asphyxia \
Underlying Cause_Birth asphyxia 1.000000
Underlying Cause_Intrauterine hypoxia -0.674476
Underlying Cause_Undetermined -0.170310
Maternal Factor_Abruptio placenta 0.160128
Maternal Factor_Abruption placenta 0.058061
... ...
Maternal Factor_Severe preeclampsia -0.030024
Maternal Factor_Twin pregnancy -0.052255
Maternal Factor_Undetermined -0.042563
Maternal Factor_Uterine rupture 0.092219
Maternal Factor_preeclampsia -0.030024
Underlying Cause_Intrauterine hypoxia \
Underlying Cause_Birth asphyxia -0.674476
Underlying Cause_Intrauterine hypoxia 1.000000
Underlying Cause_Undetermined -0.612640
Maternal Factor_Abruptio placenta -0.108003
Maternal Factor_Abruption placenta -0.011007
... ...
Maternal Factor_Severe preeclampsia 0.044515
Maternal Factor_Twin pregnancy 0.077475
Maternal Factor_Undetermined -0.153107
Maternal Factor_Uterine rupture -0.045001
Maternal Factor_preeclampsia 0.044515
Underlying Cause_Undetermined \
Underlying Cause_Birth asphyxia -0.170310
Underlying Cause_Intrauterine hypoxia -0.612640
Underlying Cause_Undetermined 1.000000
Maternal Factor_Abruptio placenta -0.027271
Maternal Factor_Abruption placenta -0.047464
... ...
Maternal Factor_Severe preeclampsia -0.027271
Maternal Factor_Twin pregnancy -0.047464
Maternal Factor_Undetermined 0.249914
Maternal Factor_Uterine rupture -0.038661
Maternal Factor_preeclampsia -0.027271
Maternal Factor_Abruptio placenta \
Underlying Cause_Birth asphyxia 0.160128
Underlying Cause_Intrauterine hypoxia -0.108003
Underlying Cause_Undetermined -0.027271
Maternal Factor_Abruptio placenta 1.000000
Maternal Factor_Abruption placenta -0.008367
... ...
Maternal Factor_Severe preeclampsia -0.004808
Maternal Factor_Twin pregnancy -0.008367
Maternal Factor_Undetermined -0.006816
Maternal Factor_Uterine rupture -0.006816
Maternal Factor_preeclampsia -0.004808
Maternal Factor_Abruption placenta \
Underlying Cause_Birth asphyxia 0.058061
Underlying Cause_Intrauterine hypoxia -0.011007
Underlying Cause_Undetermined -0.047464
Maternal Factor_Abruptio placenta -0.008367
Maternal Factor_Abruption placenta 1.000000
... ...
Maternal Factor_Severe preeclampsia -0.008367
Maternal Factor_Twin pregnancy -0.014563
Maternal Factor_Undetermined -0.011862
Maternal Factor_Uterine rupture -0.011862
Maternal Factor_preeclampsia -0.008367
Maternal Factor_Antepartum hemorrhage \
Underlying Cause_Birth asphyxia 0.092219
Underlying Cause_Intrauterine hypoxia -0.045001
Underlying Cause_Undetermined -0.038661
Maternal Factor_Abruptio placenta -0.006816
Maternal Factor_Abruption placenta -0.011862
... ...
Maternal Factor_Severe preeclampsia -0.006816
Maternal Factor_Twin pregnancy -0.011862
Maternal Factor_Undetermined -0.009662
Maternal Factor_Uterine rupture -0.009662
Maternal Factor_preeclampsia -0.006816
Maternal Factor_Chorioamnionitis \
Underlying Cause_Birth asphyxia 0.092219
Underlying Cause_Intrauterine hypoxia -0.045001
Underlying Cause_Undetermined -0.038661
Maternal Factor_Abruptio placenta -0.006816
Maternal Factor_Abruption placenta -0.011862
... ...
Maternal Factor_Severe preeclampsia -0.006816
Maternal Factor_Twin pregnancy -0.011862
Maternal Factor_Undetermined -0.009662
Maternal Factor_Uterine rupture -0.009662
Maternal Factor_preeclampsia -0.006816
Maternal Factor_Cord prolapse \
Underlying Cause_Birth asphyxia -0.030024
Underlying Cause_Intrauterine hypoxia 0.044515
Underlying Cause_Undetermined -0.027271
Maternal Factor_Abruptio placenta -0.004808
Maternal Factor_Abruption placenta -0.008367
... ...
Maternal Factor_Severe preeclampsia -0.004808
Maternal Factor_Twin pregnancy -0.008367
Maternal Factor_Undetermined -0.006816
Maternal Factor_Uterine rupture -0.006816
Maternal Factor_preeclampsia -0.004808
Maternal Factor_Eclampsia \
Underlying Cause_Birth asphyxia -0.007677
Underlying Cause_Intrauterine hypoxia 0.061015
Underlying Cause_Undetermined -0.073217
Maternal Factor_Abruptio placenta -0.012907
Maternal Factor_Abruption placenta -0.022465
... ...
Maternal Factor_Severe preeclampsia -0.012907
Maternal Factor_Twin pregnancy -0.022465
Maternal Factor_Undetermined -0.018298
Maternal Factor_Uterine rupture -0.018298
Maternal Factor_preeclampsia -0.012907
Maternal Factor_Eclampsia /HELLP Syndrome \
Underlying Cause_Birth asphyxia -0.030024
Underlying Cause_Intrauterine hypoxia 0.044515
Underlying Cause_Undetermined -0.027271
Maternal Factor_Abruptio placenta -0.004808
Maternal Factor_Abruption placenta -0.008367
... ...
Maternal Factor_Severe preeclampsia -0.004808
Maternal Factor_Twin pregnancy -0.008367
Maternal Factor_Undetermined -0.006816
Maternal Factor_Uterine rupture -0.006816
Maternal Factor_preeclampsia -0.004808
... \
Underlying Cause_Birth asphyxia ...
Underlying Cause_Intrauterine hypoxia ...
Underlying Cause_Undetermined ...
Maternal Factor_Abruptio placenta ...
Maternal Factor_Abruption placenta ...
... ...
Maternal Factor_Severe preeclampsia ...
Maternal Factor_Twin pregnancy ...
Maternal Factor_Undetermined ...
Maternal Factor_Uterine rupture ...
Maternal Factor_preeclampsia ...
Maternal Factor_Pre-labour preterm rupture of membranes \
Underlying Cause_Birth asphyxia -0.030024
Underlying Cause_Intrauterine hypoxia 0.044515
Underlying Cause_Undetermined -0.027271
Maternal Factor_Abruptio placenta -0.004808
Maternal Factor_Abruption placenta -0.008367
... ...
Maternal Factor_Severe preeclampsia -0.004808
Maternal Factor_Twin pregnancy -0.008367
Maternal Factor_Undetermined -0.006816
Maternal Factor_Uterine rupture -0.006816
Maternal Factor_preeclampsia -0.004808
Maternal Factor_Precipitated labour \
Underlying Cause_Birth asphyxia -0.030024
Underlying Cause_Intrauterine hypoxia 0.044515
Underlying Cause_Undetermined -0.027271
Maternal Factor_Abruptio placenta -0.004808
Maternal Factor_Abruption placenta -0.008367
... ...
Maternal Factor_Severe preeclampsia -0.004808
Maternal Factor_Twin pregnancy -0.008367
Maternal Factor_Undetermined -0.006816
Maternal Factor_Uterine rupture -0.006816
Maternal Factor_preeclampsia -0.004808
Maternal Factor_Preeclampsia \
Underlying Cause_Birth asphyxia -0.032492
Underlying Cause_Intrauterine hypoxia 0.132202
Underlying Cause_Undetermined -0.141664
Maternal Factor_Abruptio placenta -0.024974
Maternal Factor_Abruption placenta -0.043466
... ...
Maternal Factor_Severe preeclampsia -0.024974
Maternal Factor_Twin pregnancy -0.043466
Maternal Factor_Undetermined -0.035404
Maternal Factor_Uterine rupture -0.035404
Maternal Factor_preeclampsia -0.024974
Maternal Factor_Premature rupture of membranes, onset of labour after 24 hours \
Underlying Cause_Birth asphyxia -0.030024
Underlying Cause_Intrauterine hypoxia 0.044515
Underlying Cause_Undetermined -0.027271
Maternal Factor_Abruptio placenta -0.004808
Maternal Factor_Abruption placenta -0.008367
... ...
Maternal Factor_Severe preeclampsia -0.004808
Maternal Factor_Twin pregnancy -0.008367
Maternal Factor_Undetermined -0.006816
Maternal Factor_Uterine rupture -0.006816
Maternal Factor_preeclampsia -0.004808
Maternal Factor_Prolonged pregnancy \
Underlying Cause_Birth asphyxia 0.160128
Underlying Cause_Intrauterine hypoxia -0.108003
Underlying Cause_Undetermined -0.027271
Maternal Factor_Abruptio placenta -0.004808
Maternal Factor_Abruption placenta -0.008367
... ...
Maternal Factor_Severe preeclampsia -0.004808
Maternal Factor_Twin pregnancy -0.008367
Maternal Factor_Undetermined -0.006816
Maternal Factor_Uterine rupture -0.006816
Maternal Factor_preeclampsia -0.004808
Maternal Factor_Severe preeclampsia \
Underlying Cause_Birth asphyxia -0.030024
Underlying Cause_Intrauterine hypoxia 0.044515
Underlying Cause_Undetermined -0.027271
Maternal Factor_Abruptio placenta -0.004808
Maternal Factor_Abruption placenta -0.008367
... ...
Maternal Factor_Severe preeclampsia 1.000000
Maternal Factor_Twin pregnancy -0.008367
Maternal Factor_Undetermined -0.006816
Maternal Factor_Uterine rupture -0.006816
Maternal Factor_preeclampsia -0.004808
Maternal Factor_Twin pregnancy \
Underlying Cause_Birth asphyxia -0.052255
Underlying Cause_Intrauterine hypoxia 0.077475
Underlying Cause_Undetermined -0.047464
Maternal Factor_Abruptio placenta -0.008367
Maternal Factor_Abruption placenta -0.014563
... ...
Maternal Factor_Severe preeclampsia -0.008367
Maternal Factor_Twin pregnancy 1.000000
Maternal Factor_Undetermined -0.011862
Maternal Factor_Uterine rupture -0.011862
Maternal Factor_preeclampsia -0.008367
Maternal Factor_Undetermined \
Underlying Cause_Birth asphyxia -0.042563
Underlying Cause_Intrauterine hypoxia -0.153107
Underlying Cause_Undetermined 0.249914
Maternal Factor_Abruptio placenta -0.006816
Maternal Factor_Abruption placenta -0.011862
... ...
Maternal Factor_Severe preeclampsia -0.006816
Maternal Factor_Twin pregnancy -0.011862
Maternal Factor_Undetermined 1.000000
Maternal Factor_Uterine rupture -0.009662
Maternal Factor_preeclampsia -0.006816
Maternal Factor_Uterine rupture \
Underlying Cause_Birth asphyxia 0.092219
Underlying Cause_Intrauterine hypoxia -0.045001
Underlying Cause_Undetermined -0.038661
Maternal Factor_Abruptio placenta -0.006816
Maternal Factor_Abruption placenta -0.011862
... ...
Maternal Factor_Severe preeclampsia -0.006816
Maternal Factor_Twin pregnancy -0.011862
Maternal Factor_Undetermined -0.009662
Maternal Factor_Uterine rupture 1.000000
Maternal Factor_preeclampsia -0.006816
Maternal Factor_preeclampsia
Underlying Cause_Birth asphyxia -0.030024
Underlying Cause_Intrauterine hypoxia 0.044515
Underlying Cause_Undetermined -0.027271
Maternal Factor_Abruptio placenta -0.004808
Maternal Factor_Abruption placenta -0.008367
... ...
Maternal Factor_Severe preeclampsia -0.004808
Maternal Factor_Twin pregnancy -0.008367
Maternal Factor_Undetermined -0.006816
Maternal Factor_Uterine rupture -0.006816
Maternal Factor_preeclampsia 1.000000
[63 rows x 63 columns]
| | Underlying Cause_Birth asphyxia | Underlying Cause_Intrauterine hypoxia | Underlying Cause_Undetermined | Maternal Factor_Undetermined |
|---|---|---|---|---|
| Underlying Cause_Birth asphyxia | 1.000000 | -0.674476 | -0.170310 | -0.042563 |
| Underlying Cause_Intrauterine hypoxia | -0.674476 | 1.000000 | -0.612640 | -0.153107 |
| Underlying Cause_Undetermined | -0.170310 | -0.612640 | 1.000000 | 0.249914 |
| Maternal Factor_Abruptio placenta | 0.160128 | -0.108003 | -0.027271 | -0.006816 |
| Maternal Factor_Abruption placenta | 0.058061 | -0.011007 | -0.047464 | -0.011862 |
| ... | ... | ... | ... | ... |
| Maternal Factor_Severe preeclampsia | -0.030024 | 0.044515 | -0.027271 | -0.006816 |
| Maternal Factor_Twin pregnancy | -0.052255 | 0.077475 | -0.047464 | -0.011862 |
| Maternal Factor_Undetermined | -0.042563 | -0.153107 | 0.249914 | 1.000000 |
| Maternal Factor_Uterine rupture | 0.092219 | -0.045001 | -0.038661 | -0.009662 |
| Maternal Factor_preeclampsia | -0.030024 | 0.044515 | -0.027271 | -0.006816 |

[63 rows x 4 columns]
*****Insight from the above | correlation Analysis*****
- Correlation analysis was carried out to see how the underlying conditions and maternal factors correlate with the top three causes of child death.
- Most of the coefficients are negative; a few are positive and many are close to zero. Among the values displayed, every correlation between a maternal factor and a top cause is weak (at most about 0.25 in absolute value).
- Some of the positive correlations with `Underlying Cause_Birth asphyxia`:
  - `Maternal Factor_Abruptio placenta`: 0.160128
  - `Maternal Factor_Antepartum hemorrhage`: 0.092219
  - `Maternal Factor_Prolonged pregnancy`: 0.160128
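The large negative coefficients between the three `Underlying Cause_*` columns need careful reading. One-hot columns derived from the same variable are mutually exclusive, so they are negatively correlated by construction: for two indicators with shares p and q that are never both 1, the Pearson correlation is -sqrt(p·q / ((1-p)·(1-q))). The sketch below checks this against the printed values, assuming the matrix was computed on the 209 rows belonging to the top three causes (148 + 33 + 28); that subset is an inference from the counts in section 2(A), not something the notebook states.

```python
import math


def exclusive_dummy_corr(n_a: int, n_b: int, n_total: int) -> float:
    """Pearson correlation of two 0/1 indicators that are never both 1.

    With shares p and q, cov = -p*q and var = p*(1-p), q*(1-q),
    which simplifies to -sqrt(p*q / ((1-p)*(1-q))).
    """
    p, q = n_a / n_total, n_b / n_total
    return -math.sqrt(p * q / ((1 - p) * (1 - q)))


# Counts from section 2(A); restricting to these 209 rows is an assumption.
n_hypoxia, n_asphyxia, n_undetermined = 148, 33, 28
n_total = n_hypoxia + n_asphyxia + n_undetermined

asphyxia_vs_hypoxia = exclusive_dummy_corr(n_asphyxia, n_hypoxia, n_total)
asphyxia_vs_undetermined = exclusive_dummy_corr(n_asphyxia, n_undetermined, n_total)
hypoxia_vs_undetermined = exclusive_dummy_corr(n_hypoxia, n_undetermined, n_total)

print(f"{asphyxia_vs_hypoxia:.6f}")
print(f"{asphyxia_vs_undetermined:.6f}")
print(f"{hypoxia_vs_undetermined:.6f}")
```

Under that assumption the three results agree with the printed -0.674476, -0.170310 and -0.612640, which suggests those coefficients reflect the class sizes rather than any relationship between the causes.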
4. Feature engineering¶
You are expected to select the top infant underlying causes and maternal factors (features) that contribute to the top three causes of child death identified under 2(A) above. For this, you need to select the best and most likely features. In doing so:
- A. Select the classification models LogisticRegression, Support Vector Machine, AdaBoostClassifier, Random Forest Classifier, Gradient Boosting Classifier and XGBoost, and train each on the dataset
- B. Import the appropriate package for each of the classification models above
- C. Rank the features based on their importance for each of the top underlying causes of child death identified under 2(A), for each of the classification algorithms under (A)
[ 0.29264313 0.20315778 -0.09172598 0.20315778 -0.09172598 0.19656664 -0.09172598 -0.09172598 0. -0.12947367 0. 0. -0.12947367 0. -0.09172598 -0.09172598 -0.09172598 -0.09172598 -0.09172598 -0.12947367 -0.09172598 -0.09172598 0. -0.09172598 -0.09172598 -0.09172598 0.29264313 0. -0.09172598 0. 0.20315778 -0.33920149 -0.12947367 -0.09172598 0.29264313 -0.09172598 -0.09172598 -0.09172598 -0.12947367 -0.09172598 -0.09172598 0. -0.12947367 0. 0. 0.29264313 -0.09172598 -0.09172598 0. -0.09430552 -0.09172598 -0.09172598 0.28805891 -0.09172598 0.29264313 -0.09172598 -0.12947367 -0.13310848 0.20315778 0. ] [ 1.64953350e-01 8.88178420e-16 -2.22044605e-16 -8.88178420e-16 -2.22044605e-16 -8.88178420e-16 -2.22044605e-16 -2.22044605e-16 0.00000000e+00 4.44089210e-16 0.00000000e+00 0.00000000e+00 4.44089210e-16 0.00000000e+00 -2.22044605e-16 2.22044605e-16 -2.22044605e-16 -2.22044605e-16 -2.22044605e-16 4.44089210e-16 -2.22044605e-16 -2.22044605e-16 0.00000000e+00 -2.22044605e-16 -2.22044605e-16 -2.22044605e-16 1.64953350e-01 0.00000000e+00 -2.22044605e-16 0.00000000e+00 1.77635684e-15 8.88178420e-16 4.44089210e-16 -2.22044605e-16 1.64953350e-01 -2.22044605e-16 -2.22044605e-16 -2.22044605e-16 4.44089210e-16 -2.22044605e-16 -2.22044605e-16 0.00000000e+00 4.44089210e-16 0.00000000e+00 0.00000000e+00 1.64953350e-01 -2.22044605e-16 -2.22044605e-16 0.00000000e+00 -2.22044605e-16 -2.22044605e-16 -2.22044605e-16 4.44089210e-16 -2.22044605e-16 1.64953350e-01 -2.22044605e-16 4.44089210e-16 4.44089210e-16 8.88178420e-16 0.00000000e+00]
****Insight from the above | Feature engineering****
- The features were ranked by importance for each classifier. The top three features per classifier, in decreasing order of importance, are:
- Logistic Regression
  - Maternal Factor_Fetus and newborn affected
  - Maternal Factor_Prolonged pregnancy
  - Maternal Factor_Abruptio placenta
- Support Vector Machine
  - Maternal Factor_Abruptio placenta
  - Maternal Factor_Fetus and newborn affected
  - Maternal Factor_Prolonged pregnancy
- AdaBoost
  - Maternal Factor_Eclampsia
  - Maternal Factor_Preeclampsia
  - Maternal Factor_Abruptio placenta
- Random Forest
  - Maternal Factor_Undetermined
  - Maternal Factor_Preeclampsia
  - Maternal Factor_Fetus and newborn affected
- Gradient Boosting
  - Maternal Factor_Undetermined
  - Maternal Factor_Preeclampsia
  - Maternal Factor_Prolonged pregnancy
- XGBoost
  - Maternal Factor_Preeclampsia
  - Maternal Factor_Fetus and newborn affected
  - Maternal Factor_Eclampsia
- To summarize, some features have zero or negative importance, as shown in the graph above.
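A top-three ranking like the one above can be obtained by sorting a model's coefficients or importances. The sketch below is an illustration under stated assumptions: it ranks by signed value in decreasing order (ranking by absolute value is an equally common choice), the feature names are hypothetical placeholders, and the values are a small sample from the first array printed above. In that array the largest value, 0.29264313, occurs five times, so the sketch also flags when the cutoff falls inside a tie.

```python
import pandas as pd

# Hypothetical feature names; values sampled from the first array printed above.
coefficients = pd.Series(
    [0.29264313, 0.20315778, -0.09172598, 0.29264313, 0.0, 0.29264313, 0.29264313],
    index=[f"feature_{i}" for i in range(7)],
)

# Rank by signed value, largest first (an assumption about the intended order).
ranked = coefficients.sort_values(ascending=False)
top_three = ranked.head(3)

# If more features reach the third-place value than fit in the top three,
# which ones appear in the list depends only on sort order, not on the model.
cutoff = top_three.iloc[-1]
n_at_or_above_cutoff = int((ranked >= cutoff).sum())
ranking_is_ambiguous = n_at_or_above_cutoff > len(top_three)

print(top_three)
print("Tie at the cutoff:", ranking_is_ambiguous)
```

When the flag is true, reporting all tied features together is more faithful than listing an arbitrary three of them.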
5. Model evaluation using the proper metrics¶
- A. Import the appropriate evaluation metric packages
- B. Using the appropriate n-fold cross validation and out of sample data, select the best performing model from the candidate models under 4(A)
- C. Ensemble the models and see the performance of the combination models on the data
- D. Use Accuracy score metrics to evaluate the performance of the models above
- E. Plot the AUC and ROC curve on the same graph to visualize and compare the performance of each of the models above
| Model | Cross-validation score (mean ± std) |
|---|---|
| Logistic Regression | 0.6575 ± 0.0307 |
| Support Vector Machine | 0.6575 ± 0.0307 |
| AdaBoost | 0.6575 ± 0.0307 |
| Random Forest | 0.6575 ± 0.0307 |
| Gradient Boosting | 0.6575 ± 0.0307 |
| XGBoost | 0.6644 ± 0.0098 |

Best Model: XGBoost
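Scores in the "mean ± std" form shown above are typically produced with scikit-learn's `cross_val_score`. The sketch below is a hedged illustration rather than the notebook's code: the data are synthetic, the number of folds (5) is an assumption since the notebook only says "n-fold", and only two of the six candidate models are included to keep it short.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic three-class data standing in for the one-hot encoded features.
X, y = make_classification(
    n_samples=210, n_features=10, n_informative=5, n_classes=3, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
}

# Stratified folds keep the class proportions similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    results[name] = (scores.mean(), scores.std())
    print(f"{name}: {scores.mean():.4f} ± {scores.std():.4f}")

# Pick the model with the highest mean cross-validation accuracy.
best_model = max(results, key=lambda name: results[name][0])
print("Best Model:", best_model)
```

When several models score identically, as five of them do above, the standard deviation and a held-out test set are needed to separate them.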
VotingClassifier(estimators=[('lr', LogisticRegression(max_iter=1000)),
('svm', SVC(kernel='linear', probability=True)),
('adb', AdaBoostClassifier()),
('rf', RandomForestClassifier()),
('gb', GradientBoostingClassifier()),
('xgb',
XGBClassifier(base_score=0.5, booster='gbtree',
colsample_bylevel=1,
colsample_bynode=1,
colsample_bytree=1,
enable_categorical=False,
e...
learning_rate=0.300000012,
max_delta_step=0, max_depth=6,
min_child_weight=1, missing=nan,
monotone_constraints='()',
n_estimators=100, n_jobs=8,
num_parallel_tree=1,
objective='multi:softprob',
predictor='auto', random_state=0,
reg_alpha=0, reg_lambda=1,
scale_pos_weight=None, subsample=1,
tree_method='exact',
use_label_encoder=False,
validate_parameters=1, ...))],
voting='soft')
Ensemble Model Accuracy: 0.8095
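With `voting='soft'`, the ensemble averages the class probabilities predicted by its members and picks the class with the highest average, which is why the SVC above is created with `probability=True`. The sketch below shows the pattern on synthetic data with three of the six estimators (XGBoost is left out so that only scikit-learn is needed); the 30% test split is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic three-class data standing in for the encoded dataset.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Soft voting requires every member to implement predict_proba.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("svm", SVC(kernel="linear", probability=True)),
        ("rf", RandomForestClassifier(random_state=0)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)

# Accuracy on the held-out test set.
ensemble_accuracy = accuracy_score(y_test, ensemble.predict(X_test))
print(f"Ensemble Model Accuracy: {ensemble_accuracy:.4f}")
```

The same `accuracy_score` call applied to each fitted member gives a per-model test accuracy that can be compared with the ensemble's.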
Accuracy Score Metrics to Evaluate the Performance of the Models¶
| Model | Test accuracy |
|---|---|
| Logistic Regression | 0.8095 |
| Support Vector Machine | 0.8095 |
| AdaBoost | 0.7778 |
| Random Forest | 0.8095 |
| Gradient Boosting | 0.8095 |
| XGBoost | 0.8095 |
| Ensemble Model | 0.8095 |
Plot the AUC and ROC Curve to Visualize and Compare Performance¶
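Because the target has three classes, a single ROC curve per model needs some form of averaging. The sketch below uses a micro-averaged one-vs-rest curve, which is an assumption: the notebook does not show how the curves were built. Data are synthetic, only two models are included, and `plot_curves` is a hypothetical helper that draws all curves on the same axes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# One indicator column per class, in the same order as predict_proba's columns.
y_test_bin = label_binarize(y_test, classes=np.unique(y))

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
}


def micro_average_roc(model):
    """Fit the model and pool all (sample, class) pairs into one ROC curve."""
    scores = model.fit(X_train, y_train).predict_proba(X_test)
    fpr, tpr, _ = roc_curve(y_test_bin.ravel(), scores.ravel())
    return fpr, tpr, auc(fpr, tpr)


curves = {name: micro_average_roc(model) for name, model in models.items()}


def plot_curves(curves):
    """Draw every model's ROC curve on one set of axes, with AUC in the legend."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it

    fig, ax = plt.subplots()
    for name, (fpr, tpr, area) in curves.items():
        ax.plot(fpr, tpr, label=f"{name} (AUC = {area:.2f})")
    ax.plot([0, 1], [0, 1], linestyle="--", color="grey")  # chance line
    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    ax.legend()
    return fig


for name, (_, _, area) in curves.items():
    print(f"{name}: AUC = {area:.3f}")
```

Calling `plot_curves(curves)` produces the combined figure; a per-class or macro-averaged curve would be a reasonable alternative when the classes are as imbalanced as they are here.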
B. Plot the top five infant underlying causes of the child death¶
C. Plot the top five maternal factors contributing to the child death¶
D. Plot the child death based on the case types¶